Target Audience

With the exception of the Executive Overview, this White Paper is intended for readers with a graduate 
level (or equivalent) background in mathematics and computer science. For a complete appreciation of 
the RiskScape™ Intelligent Financial Risk Management Application, some background in quantitative 
finance is required. An understanding of current risk control methodologies such as Value-at-Risk and 
Portfolio Stress-Testing (as well as their limitations) is necessary. 
Abstract of the White Paper

This White Paper advances and details the concept of implementing a novel software architecture on a 
heterogeneous network of conventional computational platforms to create a Netcentric Virtual 
Supercomputer Infrastructure (NVSI). In order to provide a full appreciation for the commercial 
importance of this breakthrough enabling technology, the Paper also describes one possible application 
of the NVSI, named RiskScape. The RiskScape application addresses a particular computationally 
intensive problem in financial risk management known as Portfolio Stress-Testing. 

We begin by discussing why this technology represents a breakthrough, and we then enumerate the 
several innovations that we have brought to bear in order to achieve this breakthrough. We next discuss 
the nature of how NVSI accomplishes its radical level of performance without special-purpose 
hardware. We then review the reasons why we believe no one has heretofore taken this approach to 
computer system design. 

The paper continues with a mathematical treatment of the RiskScape design in order to demonstrate the 
power of the NVSI technology. In this section we outline the assumptions we have made to make the 
problem computationally tractable, as well as discuss several of the optimization techniques that the 
NVSI platform provides to the application developer. 

Finally we make a comparison of the NVSI technology with hardware-based supercomputers. This 
comparison addresses both the classes of problems to which NVSI is suited, as well as a discussion of 
those to which it is not. The last comparison is in price/performance. Here we show the estimated 
computational throughput of the NVSI system in terms of vflops (Virtual Floating-point Operations 
per Second), and compare this with two Cray Research® machines, the T90 and the T3E. 

We conclude that, for certain classes of problems for which pre-computation is a viable 
methodology, the NVSI/RiskScape solution is capable of about 250 (Virtual) GigaFLOPS. This 
performance is comparable to that of the Cray T3E. Most importantly from a commercial perspective, 
we estimate that the cost of an NVSI/RiskScape implementation will be on the order of 50 to 100 times 
less than a comparable hardware-based solution. 
Executive Overview

Netcentric computing is the new paradigm for building efficient, cost-effective computer systems to 
solve numerous business problems. Many commercial enterprises have invested substantial sums in 
computer hardware only to discover that they realize a fraction of the total CPU power. This is because 
the operating costs of a piece of hardware are identical whether the machine is running at its peak 
capacity, or sitting idle. The problem is compounded by the fact that while one machine is sitting idle, 
another in the same office is so overloaded that it has slowed to a crawl. What is needed is a way to 
allow the power of idle, or underutilized, machines to automatically augment the capacity of those that 
are overburdened. Such a solution would allow businesses to add hardware in an incremental fashion, 
rather than having to continuously upgrade expensive servers and mainframes. 

This White Paper describes a breakthrough, software-based enabling technology that transforms a 
network of conventional PCs, workstations and servers into a virtual supercomputer designed for 
optimal performance over a wide class of commercial domains. Quite general in nature, this virtual 
supercomputer can be used to solve many (although certainly not all) computationally intensive 
problems that are found in a number of different businesses. These include telecommunications 
switching, investment-portfolio valuation, e-commerce order processing, high-demand query caching, 
and fraud detection, to name a few. 

So that our proposed technology does not begin life as a solution in search of a problem, we have 
applied virtual supercomputing to the problem of Financial Risk Management. Specifically, we have 
addressed Portfolio Stress-Testing, an area that has a significant need for a cost-effective solution, and for 
which our new technology is well suited. 

However, it must be appreciated from the outset that while we describe a specific application, this 
should in no way imply that the underlying technology is limited to this application. Our sole purpose in 
blending the description of the risk management application with the enabling technology in this paper 
is to demonstrate the vast cost/performance benefit. 

RiskScape: An Application of Netcentric Supercomputing to a Business Problem 

It is clear that many commercial problems, such as Financial Risk Management, need the level of 
computational power associated with conventional hardware-based supercomputers. Unfortunately, 
unlike mission-critical projects for military and government operations, the commercial world is highly 
constrained by economic considerations. In the main, businesses cannot justify the expenditure of tens, 
if not hundreds, of millions of dollars on computer hardware that will become obsolete in three to five 
years. This is true even though rapid and accurate risk management can spell the difference between 
business success and catastrophic failure. 

What is needed for commerce is a viable solution that transforms the substantial hardware investment 
already made by a firm into a computational platform capable of supporting the necessary time-critical 
decision process. In this paper we outline our design for such a computer system (termed Netcentric 
Virtual Supercomputer Infrastructure or NVSI), and then continue to describe one commercial application 
(RiskScape™) that together can provide the level of performance financial institutions require in order to 
manage their risk. 
Our current best estimate of the sustained "Virtual FLoating-point OPerations per Second" (VFLOPS) 
obtainable by our risk-management application, supported by our proposed NVSI technology, is 
between 5 and 250 GigaFLOPS (billion floating-point operations per second), depending on the number 
of CPUs available to the system. At the upper end, this matches the computational power of a Cray 
Research T3E, the highest-end commercially available supercomputer. 

This translates as the ability to evaluate a portfolio consisting of one million instruments (including 
portfolios of derivatives on both debt and equity instruments) across three million scenarios in under 
one hour. 
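
As a rough consistency check on these figures (a back-of-envelope sketch, not part of the original analysis), the following Python snippet computes the valuation rate implied by one million instruments, three million scenarios, and a one-hour window, together with the virtual floating-point budget available per valuation at the upper estimate of 250 GigaFLOPS. The function and variable names are illustrative assumptions.

```python
def implied_rates(instruments, scenarios, seconds, vflops):
    """Return (total valuations, valuations per second, virtual flops per valuation)."""
    valuations = instruments * scenarios           # one valuation per instrument per scenario
    per_second = valuations / seconds              # rate needed to finish within the window
    flops_per_valuation = vflops / per_second      # budget available for each valuation
    return valuations, per_second, flops_per_valuation


# Figures quoted in the text: 1M instruments, 3M scenarios, one hour, 250 GigaFLOPS.
valuations, per_second, budget = implied_rates(1_000_000, 3_000_000, 3600, 250e9)
print(f"{valuations:.1e} valuations, {per_second:.2e} per second, "
      f"about {budget:.0f} virtual flops each")
```

Under these assumptions each valuation has a budget of only a few hundred virtual operations, which suggests why the design leans on pre-computation and table lookup rather than evaluating full pricing models on demand.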

It is important to note that the solution we propose is software based, and requires only an incremental 
amount of additional hardware. From an economic and commercial standpoint we estimate that a full 
NVSI/RiskScape implementation will cost between 5 and 10 million dollars. A hardware-based solution 
with comparable performance would cost in excess of 50 to 100 million dollars. Moreover, unlike a hardware 
solution, the NVSI system will not become obsolete with advances in computer technology. 
Indeed, performance only improves with evolution in platform and network capability. 
Using NVSI for on-demand stress-evaluation of portfolio risk via 
large-scale projection of instrument trajectories over a 
systematic spectrum of path-dependent scenarios, on a landscape 
of unexpected events. 

I. Background 

Computing portfolio risk (PR) is a problem taking center stage in investment risk management. 
Managing risk is not trivial. Banks and other financial institutions are in the business of taking risks to 
generate increased revenues and profits, but they are also required to protect shareholder value and 
prevent catastrophic losses from occurring. To date, evaluation of PR has mostly been confined to 
macroscopic aggregate measures such as Value-at-Risk (VaR), which estimates PR as an expected loss 
derived from a weighted sum of volatilities in the individual securities in the portfolio, based on small, 
statistically-derived market moves. Current risk management systems thus do an effective job of 
characterizing the expected loss in linear portfolios operating in normal markets. Indeed, some form of 
full Monte Carlo VaR is the state-of-the-art for both market and credit risk measurement [Jorion 97, 
Lawrence 96]. 

Yet, VaR indicates only the maximum expected loss that could occur over some time interval (the 
portfolio holding period) within some confidence level (usually 2σ, or two standard deviations, about 97.5%). 
VaR has nothing to say about discontinuous or extreme (3σ+) market events, such as the Russian 
Sovereign Debt, and Japanese and Emerging Market currency crises. That is, VaR ignores the "fat tails" 
in the distribution of portfolio values within which lurk the "dragons" of risk, the unexpected large 
moves in financial variables that can cause substantial losses, as have been suffered by a number of 
leading financial institutions over the last several years (such as LTCM). And given that the financial 
markets are not normally distributed (they are log-normal), such events happen with considerable - and 
distressing - regularity. To cover the possibility of extreme events, financial institutions implement large 
safety factors, as the absence of specific risk data for the "tails" induces great caution. The result is that 
excess capital is lying dormant, an inefficient solution at best. Moreover, if the VaR method is pushed to 
achieve a greater range of application, severe computational limitations arise. Finally, even setting this 
consideration aside, VaR calculations still do not really address the "What if?" scenario questions. 

In light of these limitations, the method of stress-testing has evolved as a complement to VaR. Also 
known as scenario analysis, this approach attempts to address the weaknesses in VaR by subjectively 
generating scenarios that simulate large-variance events. This enables the handling of nonlinear 
positions, and certainly fills in some of the gaps. But as Jorion points out, current implementations of 
stress-testing are flawed because of the small number of scenarios that can be examined (due to 
computational constraints), thus forcing necessarily subjective choices about which extreme changes to 
evaluate. The method also considers movements in only one or a few variables, and correlations are 
virtually ignored. And most glaring of all the defects in current stress-analysis of PR is the inability to 
forecast path-dependent scenarios several time-steps into the future, again due to limits on 
computational resources. 

Even if the appropriate predictive tools existed (such as comprehensive scenario-analysis systems that 
accessed rich historical data combined with "mark-to-future" states), the sheer data problem is 
enormous. Further, the integrity of the data is crucial for assuring the validity and integrity of the risk 
management process results. Real issues of analytical model fidelity and accuracy compound these 
challenges. New financial products are finding their way to market (credit derivatives, synthetic financial 
products, etc.) at an accelerating pace. As the worlds of market risk and credit risk begin to merge, 
transaction volume is growing at an accelerating pace. It is increasingly clear that the standard approaches 
to managing risk are not keeping pace with the problem domain. The approach to problem solving in 
the risk management marketplace today is gated by problems of computing power, database transaction 
processing throughput, application design, application scalability, and user interface technology 
[Berkowitz & Wurtz 98]. 

Of course, this is not news. It is well known that the ideal approach would be to simulate the detailed 
price trajectory for all the instruments in a portfolio, over a broad range of path-dependent scenarios, 
using the best (perhaps several) pricing models available (or Monte Carlo simulation, finite-difference 
methods, numerical integration, or tree expansion, where closed-form models do not exist), all calibrated 
by accumulations of historical data to provide correlation coefficients, scaling factors, and transition- 
probabilities for variations in financial parameters. The difficulty has been that such an approach 
requires an enormous amount of computing power (on the order of leading-edge supercomputers), at a 
cost that is daunting to even the most resource-rich investment banks. So, we settle for VaR, and a lot of 
theoretical modeling and projection. But, as pointed out by Wurtz [98], risk analysis must be essentially 
data-driven, not a theory-driven exercise performed in a data vacuum. 

This paper outlines a new technology that allows, for the first time, a practical solution to the problem 
of calculating future risk for large-scale portfolios. We present a novel computing architecture - 
essentially a software-based "virtual supercomputer" - that supports on-demand access to projected 
prices over portfolios of O(1M) securities, for a full range of path-dependent scenarios that entail large 
(3σ+) moves in financial variables. Termed NVSI (Netcentric Virtual Supercomputer Infrastructure), 
the system is a suite of highly-optimized system kernels and user applications, designed to emulate the 
key aspects of supercomputer architecture (tools, techniques, and algorithms), running on off-the-shelf 
workstation and network hardware, for about 1/100th the overall cost of a dedicated supercomputing 
system. 

How does it work? 

In brief, by continuous off-line (background) computing, the RiskScape/NVSI constructs a daily updated 
landscape of projected portfolio values for a broad range of scenarios, along with associated scenario 
probabilities. This multidimensional state-space (hyperspace) can then be queried to yield near-real-time 
answers to questions such as: which scenarios (if any) could result in catastrophic loss to my portfolio in 
a week, ninety days, and six months out, and with what likelihood? 

By using supercomputing techniques emulated in software, model optimization, massive non-swappable 
RAM, and distributed processing over commercial networks of existing workstations, the NVSI first 
populates a hyperspace of up to 10 billion nodes with state-vectors that contain the pricing information 
(and other moments) for the entire range of instruments in a portfolio of up to 10 million securities. 
Once the hyperspace is fully populated with pricing vectors, the portfolio can then be marked to any 
future state (scenario) desired, simply by looking up (or navigating to) the address of that state and 
applying the pricing vectors to each instrument. The computational overhead of this second phase is 
minimal, as the vectors have been pre-computed off-line. Clearly, the combinatoric nature of the 
problem requires that the solution space be properly constrained. To do this correctly, the optimal 
granularity and distribution of the state-vectors in the hyperspace must be determined to ensure that the 
problem domain is fully bracketed. 

If the set of state-vectors is chosen appropriately, the entire landscape of scenarios can be searched for 
the extreme events. In fact, the application can be programmed to find the boundary conditions for 
catastrophic loss. In other words, the NVSI system can actively search future state-space to determine 
what combination of market and/or credit conditions would cause the portfolio (or institution) to fail. 
In addition, the probability of these conditions obtaining, and the likely amount of time required for the 
conditions to obtain, as well as the transition states through which the world must pass to reach the final 
state, can be determined. 
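
The two phases described above (off-line population, then on-demand marking and search) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the NVSI implementation: `toy_price` is a hypothetical stand-in for a real pricing model, the scenario grid is a plain dictionary rather than a MassRAM hyperspace, and all names and numbers are invented for the example.

```python
from itertools import product


def toy_price(base_price, price_shock, vol_shock):
    # Hypothetical pricing rule, used only to give the table some content.
    return base_price * (1.0 + price_shock) * (1.0 - 0.5 * vol_shock)


def populate(base_prices, price_shocks, vol_shocks):
    """Phase 1 (off-line): pre-compute a price vector for every scenario."""
    table = {}
    for scenario in product(price_shocks, vol_shocks):
        table[scenario] = {name: toy_price(p, *scenario)
                           for name, p in base_prices.items()}
    return table


def mark_to_scenario(table, holdings, scenario):
    """Phase 2 (on demand): value the portfolio by lookup, with no re-pricing."""
    prices = table[scenario]
    return sum(qty * prices[name] for name, qty in holdings.items())


def find_catastrophes(table, holdings, current_value, max_loss_fraction):
    """Search the whole landscape for scenarios whose loss exceeds the limit."""
    floor = current_value * (1.0 - max_loss_fraction)
    return [s for s in table if mark_to_scenario(table, holdings, s) < floor]


base_prices = {"A": 100.0, "B": 50.0}     # illustrative instruments
holdings = {"A": 10, "B": 20}             # current value: 2000
table = populate(base_prices, (-0.4, 0.0, 0.4), (0.0, 0.5))
bad = find_catastrophes(table, holdings, 2000.0, 0.3)
print(bad)   # scenarios losing more than 30% of current value
```

The design point the sketch tries to capture is that the expensive step (`populate`) is independent of any query, so the query step reduces to lookups and sums.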

Such information would allow strategic management to see problematic conditions in advance, and take 
appropriate action. Problematic financial and non-financial holdings can be analyzed, understood and 
unwound before potential financial meltdowns occur. 

Why hasn't this been done before? 

The answer is, because the computing demand is enormous, the price/performance ratio for available 
technology has just recently come within practical range, and no one has heretofore applied a blend of 
software solutions using virtual-supercomputer architecture, off-the-shelf network hardware, background 
computing, a spectrum of numerical optimization techniques, and domain-specific "tricks" to make the 
computational problems more tractable. 

Large-scale computing problems have been around since the Manhattan Project, and indeed, Los 
Alamos was the necessity that mothered the invention of DBM (colloquially named "Dah Big 
Machine"). Until recently, development of DBMs, now known as supercomputers, has been driven by 
military and civilian government needs, and hence, contracts. Typical application domains have been 
weather forecasting, code breaking, signal and image processing, intelligence evaluation, aerospace 
engineering, and nuclear weapons design. The resulting machines were designed and built with cost as 
no object, and their prices reflect that history. Spending $50M-$150M on a dedicated number-cruncher 
is an expense difficult to justify for an investment bank, especially when that glistening rocket ship will 
become a burdensome dinosaur in about three years. 

The alternative has been to scale-up existing software that calculates trajectories for individual 
instruments, such as options and other derivatives, using pricing models like Black-Scholes. These 
programs work perfectly well, and provide a flexible platform for future improvements, except that they 
provide single-instrument answers for one input vector in time-frames on the order of a minute, or at 
best, several seconds. This is fine for the trader negotiating a position, but when this performance is 
expanded to include a family of trajectories arising from a wide range of scenarios, and then further 
multiplied by up to a million instruments in a large global portfolio, even brave souls pale. Waiting a 
hundred days or more to get an answer somewhat moots the requirement for daily update and response. 
The application of the NVSI techniques, in coordination with the recent convergence of several other 
factors, enables the construction of an affordable virtual-supercomputing system that can usefully meet 
the need for large-scale, near-real-time portfolio risk analysis. These recent factors are: 

• The increased processing speed of affordable workstations, 

• Improvements in net-centric computing software, 

• The availability of 1GB DRAM chips, thus allowing 128GB (40-bit address) or more of RAM 
in a single box, and 

• Increased demand for a solution due to the increased rate of large bank failures (attributable to a 
lack of sufficient stress-testing). 

In essence, no one has demonstrated a software-emulated supercomputer, because the performance 
would be too slow for the real-time government applications listed above, and providing a one-off 
solution for near-real-time commercial needs would require massive construction of custom software at 
a cost nearly as burdensome as buying a big machine. 

Instead, we have determined that it is possible to build a suite of leading-edge system programs and 
software applications, running on networks of off-the-shelf workstations, that gives dedicated, near- 
supercomputer performance without requiring much (or eventually, any) specialized hardware. Such an 
approach can evolve with improvements in hardware and software technology. When presented as a 
retail solution to potential client financial institutions at a reasonable cost (1/100th that of a dedicated 
supercomputer), the NVSI system becomes a viable product. 

II. What is new here? 

What is not new is netcentric computing of large-scale problems.* The innovation is in building 
integrated, highly-optimized software that emulates the kit of hardware supercomputing tools and 
techniques, to create a hardware-independent "virtual supercomputer", optimized to solve a wide class 
of problems that require large-scale evaluation of independent state-functions in an unbounded 
hyperspace of multidimensional inputs and outputs. Indeed, our architecture could be used to solve a 
range of similar problems in other domains, such as credit risk, query-caching, and transaction- 
processing for global compliance management on the internet, which are decomposable into separate 
processes of background pre-computation and real-time (demand-query) navigation of a large state- 
space. NVSI would not be suitable for a domain that required the system to "keep up" with a data 
stream from the "world" arriving in real-time, such as cryptographic analysis. 

The following innovations and/or breakthroughs in RiskScape/NVSI are detailed in subsequent sections: 

Innovations in Computational Design: 

• Flexible-structure, fully-compacted, variable-length data words, optimizable for specifiable 
problem domains; 

• Flexible connectivity to allow optimal hyperspatial topology (graphs, trees, hypercubes) relative 
to a spectrum of specifiable problem domains; 

• Highly-optimized numerical techniques for moderate-accuracy computation; 

• Software emulation of supercomputing structures and processes (such as simple, efficient data 
representation and handling, inherent vector representation, limited data/computation modes, 
interleaved memory, table lookup, induced pointers, and distributed & parallelized 
computation [Aho et al.; Hillis 85]), thus providing cost-effective scaling and enhancement; 

• Separation of processes into pre-computation (populating the state-space) and navigation 
(searching the resulting hyperspace of results); 

• Second-order "daemon" processing to interpolate with finer granularity (mesh enhancement) 
around selected nodes in state-space. 

* This has also been termed "Metacomputing" by the NCSA (National Center for Supercomputing Applications) and its 
affiliated institutions, and is being actively pursued as an alternative to single-box supercomputing, using wide-area 
networks of high-end mainframes. 

Innovations in Computational Risk Analysis: 

• Simulating only extreme (3σ+) moves in financial variables; 

• Optimized model representations and pre-computed parametric function spaces; 

• Using virtual or proxy instruments to represent whole classes of securities - such as cash, options 
and currency swaps - that share the same basis (underlying asset or index); 

• Statistical sampling to create a much smaller representative portfolio. 



III. Overview of the NVSI Architecture 

The functional block-diagram of the NVSI illustrates the essential aspects of the design: 
Figure 1: Functional block diagram of the NVSI system 

The core of the NVSI system is a suite of powerful state-of-the-art applications built around a large (128GB) 
dedicated (non-swappable) MassRAM, that in later versions will be implemented via shared RAM from 
network resources (or even hard-storage-based virtual memory). The particular structure of the data words 
and hyperspatial connectivity are implemented from domain-specific parameters set by the client user via the 
RiskScape application. The Instantiation Manager then configures the design by creating a set of tables (in CPU 
RAM) containing metadata, such as data-word definitions, tree structures, pointer parameters, network 
sharing, and problem-domain specifications & indexes. The state-vectors (hyperspace nodes) are then 
computed and stored in MassRAM by the Population Manager, which calculates all function (model) values for 
each scenario. The Population Manager works continually, in background, distributing the burden across the 
shared Network CPUs. On-demand mark & search (via queries from RiskScape) of the state-space is then 
handled by the Navigation Manager, in concert with the Interpolation Manager (and its associated Extended 
MassRAM - EMR - for finer-grained exploration of selected node-neighborhoods). More detail is presented 
in subsequent sections. 

IV. Data Structures 

For each entity in the collection (each instrument-type in a portfolio, for risk applications), the NVSI 
constructs a path-dependent tree (a rooted, ordered, unidirected graph), also more generally termed a 
metastructure or scenario-tree. The branches (edges) represent variations $\Delta I_j$ in input-parameter $I_j$ with 
fanout K. Every node (vertex) is a state-vector s containing the value (output) of one or more model 
vectors V for the given instrument (allowing for multiple models to value the same instrument), and the 
probability P(s) associated with that node (derived from the conditional probability of the particular 
input parameter variation that led to the current state), for each time-step $t_k$ in a sequence of chosen 
intervals. An example tree, for fanout K = 9, is shown below: 

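[Figure 2 diagram: a tree fragment drawn against a time axis $t_0, t_1, t_2, \ldots$, annotated with 
$s_i = \langle j, k, z, P(s_i), n, V, V' \rangle$, 
$P(s_{11}) = P(s_2) \cdot P(s_{11} \mid s_2)$, and 
$P(s_{11} \mid s_2) = P(\Delta I_j \mid I_j)$.] 

Figure 2: An example tree-fragment illustrating how nodes and branches are connected, 
how states are represented, and how probabilities are derived. 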
For each node i (a point in the state space, H) there is an associated state $s_i$, defined by a 7-tuple 
state-vector: 

$s_i = \langle j, k, z, P(s_i), n, V, V' \rangle$ 

where: 

• $1 \le i \le Q$, Q = cardinality of H = total number of nodes (state-vectors); 

• j is the number of the particular branch in a fanout of K (required to derive the back pointer), 
$1 \le j \le K$; 

• k is the number of the time-step for interval $t_k$, $0 \le k \le x$, x = chosen number of time-steps 
= depth of the tree. Note that the intervals may be non-uniform; 

• z is an array of logical (Boolean) flags indicating various computation conditions, such as mode 
(populate or navigate), update status, interpolation-flag, constraint-satisfaction flag, etc.; 

• $P(s_i)$ is the probability of state $s_i$ occurring, calculated in the standard way from the joint 
probability of all ancestor nodes, which is in turn derived from the recursive product of 
marginal and conditional probabilities: 

$P(s_i) = P(s_i \cap {}'s_i \cap {}''s_i \cap \ldots)$, where the $'$ operator denotes a parent state (node 
on the tree), 

$P(s_i) = P({}'s_i) \cdot P(s_i \mid {}'s_i)$, $P({}'s_i) = P({}''s_i) \cdot P({}'s_i \mid {}''s_i)$, $0 \le P(s) \le 1$. 

For RiskScape applications, the conditional probabilities are just the transition 
probabilities for changes in financial variables. In general, although the states (output 
function values) are not usually path-dependent (except for some exotic 
instruments), the transition probabilities are conditional. Thus, the probability of a 
change in volatility σ, for example, is dependent on the baseline: 

$P(s_i \mid {}'s_i) = P(\Delta\sigma \mid \sigma)$; 

• n is the number of functions or variables included in V, and thus essentially determines the 
data-word length, where: 

• $V = [v_1, \ldots, v_n]$ is the value-functional, a set of one or more v, which are domain-specific function 
equations, input parameters, output variables, or parameter-pointers, derived from pricing 
models (or any state-independent equation from the problem domain), where a given v is 
generally defined by: 

$v = f(t, I(t))$, where: 

• $I(t) = [a, b, c, \ldots]$ is an array (vector) of m input variables that are subject to change ($\Delta I_j$) at each 
time-step, thus generating the various K branches for each node; 

• $V'$ is the dual of V, containing the next-day's "aged" set of values, computed by the Population 
Manager after it computes V (see section VI). 

A particular state is defined by computing scalar values for all the variables (or functions) v defined in the 
state-vector, for the particular node in the tree. 
Note that K is just the total number of variations over I. The way that K is mapped onto I, i.e., how the 
available granularity in variations is distributed across the input variables, is defined by the client-user. 
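
The recursive definition of the state probability (the probability of a node is its parent's probability times the conditional probability of the transition into it) can be illustrated with a short Python sketch. The explicit `parent` map, the node numbers, and the transition probabilities below are assumptions chosen for illustration only; the system described here derives parent addresses arithmetically rather than storing them.

```python
def joint_probability(node, parent, conditional):
    """Multiply conditional transition probabilities along the path to the root.

    parent:      maps each non-root node to its parent node
    conditional: maps each node to P(node | parent); the root maps to 1.0
    """
    p = 1.0
    while node is not None:
        p *= conditional[node]       # P(s | parent of s) for this step
        node = parent.get(node)      # move one level up; None once past the root
    return p


# Illustrative fragment: root 1 -> node 2 -> node 11 (hypothetical probabilities).
parent = {2: 1, 11: 2}
conditional = {1: 1.0, 2: 0.05, 11: 0.10}

p2 = joint_probability(2, parent, conditional)     # P(s_2)
p11 = joint_probability(11, parent, conditional)   # P(s_11) = P(s_2) * P(s_11 | s_2)
print(p2, p11)
```

Because each node stores its own joint probability, a populated tree never needs to repeat this walk at query time; the sketch only shows how the stored value would be derived.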

Three key features of the data structure are: 

• Pointers are not stored, but induced. That is, successor nodes are indexed by an algorithm 
(modified from Knuth 68) that computes addresses for child nodes in balanced trees as 
adjacent words in memory. For parent node i, the child nodes on its subtree are numbered by: 

$K \cdot (i - 1) + (1 + j), \quad i > 0, \quad 1 \le j \le K.$ 

This enables fast calculation of pointers, saves an enormous amount of storage, and wastes 
virtually no memory space. And because the trees are fixed until the state-space is recreated 
for a new domain and a new set of scenarios, there is very little shuffling or garbage collection 
required. 

• The actual data values are stored as bit-string integers, even for floating-point numbers, which 
are stored as fixed-point integer pairs (mantissa & exponent) of greatly shortened length to 
handle just as much accuracy as needed (typically 10 bits - for 1 part in 1000 - instead of the 32 
that is standard in desktop ALUs). These short integers are passed to the system ALU for 
simple integer arithmetic, thus dramatically decreasing the computational demand. 

• Exponentials, logarithms and roots are retrieved via stored-table lookup, again at just the 
accuracy required. 
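
The induced-pointer rule in the first feature above, child number = K·(i-1) + (1+j) for parent i and branch j, can be exercised directly. The Python sketch below implements that formula and, as an assumption not stated in the text, the inverse mappings (parent index and branch number) obtained by solving the same formula for i and j; the function names are illustrative.

```python
def child_index(i, j, K):
    """Address of the j-th child (1 <= j <= K) of node i (1-based, root = 1)."""
    if i < 1 or not 1 <= j <= K:
        raise ValueError("require i >= 1 and 1 <= j <= K")
    return K * (i - 1) + (1 + j)


def parent_index(c, K):
    """Back pointer: the parent of node c (derived by inverting child_index)."""
    if c < 2:
        raise ValueError("the root has no parent")
    return (c - 2) // K + 1


def branch_number(c, K):
    """Which branch j of its parent leads to node c (also derived by inversion)."""
    if c < 2:
        raise ValueError("the root is not on any branch")
    return (c - 2) % K + 1


K = 9
print([child_index(1, j, K) for j in (1, K)])   # first and last child of the root
print([child_index(2, j, K) for j in (1, K)])   # first and last child of node 2
print(parent_index(11, K), branch_number(11, K))
```

With K = 9 the root's children occupy addresses 2 through 10 and node 2's children occupy 11 through 19, so siblings sit in adjacent words and no pointer needs to be stored in either direction.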

The total number of nodes (or states), Q, in the hyperspace H is constrained by the total amount (in 
bytes) of dedicated MassRAM (denoted by Mem) available, and A, the characteristic length (in bytes) of 
the data words. Thus, 

$Q(H) = \text{total number of nodes (states) in the hyperspace} = \mathrm{Mem} / A.$ 

As shown later, a typical instantiation for RiskScape yields $Q \sim 10^{10}$ states. 
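
To give a feel for the scale involved, the Python sketch below evaluates the node count for the 128GB MassRAM mentioned in Section III. The 12-byte word length is purely a hypothetical value chosen for illustration (the text does not state A here), and the interpretation of 128GB as 128 × 2^30 bytes is likewise an assumption.

```python
def max_nodes(mem_bytes, word_bytes):
    """Number of whole data words (state-vectors) that fit in memory: Q = Mem / A."""
    if word_bytes <= 0:
        raise ValueError("word length must be positive")
    return mem_bytes // word_bytes


MEM_BYTES = 128 * 2**30      # 128GB of dedicated MassRAM, read as binary gigabytes
WORD_BYTES = 12              # hypothetical characteristic word length A

Q = max_nodes(MEM_BYTES, WORD_BYTES)
print(f"Q = {Q:,} nodes")
```

A word of roughly this size yields a node count on the order of ten billion, which is the scale quoted earlier for the populated hyperspace; longer data words reduce Q proportionally.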

A scenario is the path of events, or conditional changes, that leads to a given state. Thus, each node 
represents the result of one scenario. The entire set of scenarios yields a scenario-tree, which is exactly 
one metastructure. Note that, typically, different metastructures are defined on the same landscape of 
events, and thus share the same set of scenarios. The total number of separate trees is therefore 
constrained by both Q(H) and the chosen fanout K (derived from the desired granularity in input 
variation) and x (the total number of time-steps desired in the analysis). 

For balanced trees, the number of nodes M in a tree is given by the sum of a geometric progression of 
base K: 

$M = (K^{x+1} - 1)/(K - 1)$ = total number of scenarios 
$\approx K^{x}$, for $K \gg 1$. 
The total number of available scenario-trees (metastructures), N, is therefore: 
N = Q/M. 
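
A small Python sketch makes the trade-off between fanout, depth, and the number of trees concrete. It sums the geometric progression 1 + K + ... + K^x for a balanced tree of depth x and divides the node budget Q by the result. The particular values of K, x, and Q in the usage lines are illustrative assumptions, not figures taken from the RiskScape design.

```python
def nodes_in_tree(K, x):
    """Nodes in a balanced tree of fanout K and depth x: 1 + K + ... + K**x."""
    if K < 2 or x < 0:
        raise ValueError("require K >= 2 and x >= 0")
    return (K ** (x + 1) - 1) // (K - 1)      # exact: the division has no remainder


def trees_available(Q, K, x):
    """Whole scenario-trees that fit in a hyperspace of Q nodes: N = Q / M."""
    return Q // nodes_in_tree(K, x)


M_small = nodes_in_tree(9, 2)            # 1 + 9 + 81
M = nodes_in_tree(12, 6)                 # hypothetical fanout 12, six time-steps
N = trees_available(10**10, 12, 6)       # hypothetical budget of ten billion nodes
print(M_small, M, N, M / 12**6)          # the last value shows M is close to K**x
```

Because M grows as K^x, each additional time-step or finer input granularity reduces the number of distinct trees that fit by roughly a factor of K.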

V. RiskScape Structures & Design Factors 

The NVSI, along with the RiskScape Intelligent Financial Risk Application, was originally designed to 
optimize the computation of future portfolio values over a range of scenarios designed to uncover 
"dragons", the potential catastrophes that can result from less likely events (changes in relevant financial 
variables) that lie in the "fat tail" of quasi-lognormal (kurtotic) distributions presumed to underlie most 
financial variations. Typically, large global portfolios are composed of 100,000 to 10M instruments 
(securities), and include a wide range of types: stocks (equities), bonds, futures, currencies, swaps, 
convertible debt, interest-rate instruments, options (on all of these), and other exotics. Various models 
of these instruments exist, and the output variables include price and the various moments (the 
"greeks") of the model function. Typical input variables, which are changed systematically in a 
simulation to construct a scenario tree, include the price of the underlying asset, the asset volatility σ, 
the interest rate, and the interest-rate volatility. Some models have closed-form solutions that are relatively 
straightforward to evaluate (such as Black-Scholes). Others, such as those for interest-rate options, 
require some form of stochastic simulation (such as Monte Carlo [Berkowitz 98]), finite-difference 
methods, or trinomial tree expansions for determining term structure (such as Black, Derman, Toy), and 
are thus far more cumbersome to compute. 

Probability values 

In the prototype version designed for use in financial risk analysis, the data structure has elements that
are key for future-risk computation. For portfolio evaluation, the conditional probabilities $P(\Delta I \mid I)$
correspond to the actuarial transition-probabilities derived from tables of historically-calibrated
movements in financial variables, or extracted from relevant time-series of changes in those variables.
For risk analysis, the risk-neutral probabilities are also important, so the RiskScape interface allows the
user to choose either, or both. If both are specified, one of the alternative probabilities is stored in one
of the n variables of the value-vector V.

With a K (fanout) of only 12-16, and four input variables, for example, there is not enough granularity to
allow for joint transitions (such as $\Delta\sigma = 4$ and $\Delta p_u = -0.5$). Thus, in the prototype, branches typically
denote orthogonal moves, that is, changes in one input variable at a time. However, implicit
nonorthogonality can be approximated by using historically-derived correlations between financial
variables. This approach relies on the fact that, with large moves, correlations between financial variables
become tighter. Alternatively, K can be increased to allow for joint transitions, consonant with available
resources.

To make the computation tractable for large portfolios (1M+ instruments), and congruent with the
theme of future-risk analysis, which is to look at possible catastrophes ("dragons") arising from large
moves in the input variables, only changes of $\Delta I > 3\sigma$ are typically evaluated. Although available tables
of transition probabilities do not always contain such data, the required probabilities can be easily
extrapolated, as the distributions (or at least the first moments) are known.

Instrument Proxies

Even with 128GB RAM and modest tree size (K = 12 and T = 6), for large heterogeneous portfolios an
unaugmented NVSI can support no more than ~ 100-1000 metastructures (separate scenario trees),
depending on the word length A (see section VII). To compensate, one of the innovations in the
architecture is to use "virtual" or proxy instruments: state-vectors with V composed of generic
model functions for all the securities based upon the same underlying asset (or base index). The actual
instrument prices can then be calculated during navigation, by evaluating the corresponding function
and converting back using standard scaling and correlation coefficients. When a proxy state is computed
(during the hyperspace-population phase), the various prices and moments of a given model are
calculated over a range of values, thus creating a parametric space that brackets the range of possible
magnitudes for the portfolio instruments. During the navigation phase, the actual instrument prices (and
other moments) are then calculated via extraction of values from the pre-computed parametric space,
using the state-dependent input parameters stored in the proxy state-vector.

For example, suppose the actual instrument is a standard European option, defined by free parameters
K (= strike-price), S (= $p_u$), r (= i), $\delta$ (= dividend yield), T (= time-to-expiration), and $\sigma$ (= volatility). In this
example K and T denote the option's strike and expiration, not the tree fanout and depth. If the
model of choice is Black-Scholes, the option price can be obtained from a "normalized" space defined
by only three parameters: $\{K/S,\ (r - \delta)T,\ \sigma\sqrt{T}\}$ (see Addendum A1). During the population phase, the
parametric pricing space is pre-computed (creating a "parametric-hypercube") and mapped to the proxy,
and the values of the input vector I are stored in s for each node in the scenario tree. Then, during
navigation, the price p of the actual portfolio option instrument is calculated (marked to scenario) simply
by extracting a virtual price $p_{proxy}$ from the associated parametric-hypercube using the input values (K,
r, T, $\delta$) and then transforming the result using known scaling ($\alpha$) and correlation ($\beta$) factors. That is,

$p = \alpha \cdot \beta \cdot p_{proxy}$.
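
The populate-then-extract idea for the Black-Scholes case can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the RiskScape implementation: the grid ranges, the nearest-neighbour extraction, the function names, and the choice to fold the un-normalizing factor S·exp(-δT) into the scaling factor α are all assumptions.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normalized_call(m: float, b: float, v: float) -> float:
    """Black-Scholes call price divided by S*exp(-delta*T).

    m = K/S, b = (r - delta)*T, v = sigma*sqrt(T), with m > 0 and v > 0.
    """
    d1 = (-math.log(m) + b + 0.5 * v * v) / v
    d2 = d1 - v
    return norm_cdf(d1) - m * math.exp(-b) * norm_cdf(d2)

def build_hypercube(m_grid, b_grid, v_grid):
    """Population phase: pre-compute the normalized price on a 3-D grid."""
    return [[[normalized_call(m, b, v) for v in v_grid]
             for b in b_grid] for m in m_grid]

def nearest_index(grid, x):
    """Index of the grid point closest to x."""
    return min(range(len(grid)), key=lambda i: abs(grid[i] - x))

def proxy_price(cube, grids, K, S, r, delta, T, sigma):
    """Navigation phase: extract the virtual price without re-pricing."""
    m_grid, b_grid, v_grid = grids
    i = nearest_index(m_grid, K / S)
    j = nearest_index(b_grid, (r - delta) * T)
    k = nearest_index(v_grid, sigma * math.sqrt(T))
    return cube[i][j][k]

def instrument_price(p_proxy, alpha, beta):
    """Transform the virtual price with scaling and correlation factors."""
    return alpha * beta * p_proxy

# Hypothetical coarse grids; the paper uses 100 increments per dimension.
GRIDS = (
    [0.5 + 0.05 * i for i in range(21)],    # K/S from 0.5 to 1.5
    [-0.10 + 0.05 * i for i in range(5)],   # (r - delta)T from -0.10 to 0.10
    [0.05 * (i + 1) for i in range(20)],    # sigma*sqrt(T) from 0.05 to 1.00
)
CUBE = build_hypercube(*GRIDS)

p_proxy = proxy_price(CUBE, GRIDS, K=100.0, S=100.0, r=0.05,
                      delta=0.05, T=1.0, sigma=0.2)
# Assumption: alpha carries S*exp(-delta*T); beta = 1 means perfect correlation.
price = instrument_price(p_proxy, alpha=100.0 * math.exp(-0.05), beta=1.0)
```

Whatever the cost of the pricing model, the navigation-phase work here is three parameter calculations and one table read, which is the property the architecture relies on.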

Recall that each different proxy represents instruments with different underlying assets, and each proxy 
is evaluated over one scenario-tree. Thus, the number N of available metastructures determines the 
number of available proxies onto which the portfolio can be mapped. 

There can be many different classes, or types, of instruments in a portfolio [Hull 93], including: 

1. Equities & their derivatives (d) 

2. Debt Instruments & d 

3. Swaps & d 

4. Currencies & d 

5. Collateralized Mortgage-Backed Obligations & d 

6. Exotics (such as Interest Rate, Barrier, Lookback and Knockout options). 

Some types are much more cumbersome to compute, as indicated above. For example, some of the
exotics require complex use of input parameters, such as interest-rate derivatives that use a
path-dependent interest-rate curve (or actually, an interest-rate vector). Models such as Heath-Jarrow-Morton
and Black-Derman-Toy are valued with state-dependent trees, which the NVSI architecture already
supports.

The proxy state-vector must contain models or parametric references for as many types of instruments
as there are in the portfolio. For example, a proxy might contain variables for cash, futures, standard
options (all Type 1), sovereign debt (Type 2), and interest-rate options (Type 6). All instruments of Types
1, 2, & 6 in the portfolio that are also based upon the same underlying asset or base index (such as the
S&P 500) are linked to this proxy. Thus, there need to be as many proxies as there are underlying
indexes in the portfolio (typically in the hundreds).

Data Word Structure 

The structure and size of a data word (the bit-wise representation of a state-vector) is specified by the 
Data-word Definition Table (see MetaTables), which is constructed by the Instantiation Manager 
according to specifications received from the client-user. For each proxy, the Population and Navigation 
Managers carry this table around, so to speak, as a part of their operation. 

Data-word structure is flexible. The default is as follows:

The back-pointer, j, indexes the branch (of K) from the parent node that a state occupies. The
prototype allows for a maximum fanout of K = 64, so the length of j in bits is:

len j = 6 bits.

The prototype allows for up to T = 16 time steps, so the time-step index, k, has length:

len k = 4 bits.

The flag-vector, z, is one byte (to handle up to 8 conditions):

len z = 8 bits.

Probabilities must reflect likelihoods derived from large-move events, and need be no more accurate
than 1 part in 1000 (three significant digits). Thus,

len P(s) = 10 bits.

The number n of variables in the value-vector V is unbounded, but the prototype allows for 256. Thus,

len n = 8 bits.

Strictly, n is redundant, because the data-word-definition table (created by the Instantiation Manager)
specifies the number of variables in the state-vector. Yet, placing n in the state-vector promotes data
integrity, and its storage penalty is small.

The first part of a state-vector (everything but V), the header, thus has a length:

len(header) = 6 + 4 + 8 + 10 + 8 = 36 bits.

Finally, the variables in V can represent function outputs, parameter ratios, input variables,
statistical-distribution points, partial-derivative values, or parameter-reference pointers. In general, the magnitude
ranges are known and specified in the data-word-definition. Thus, the only part stored in the
state-vector is the integer mantissa, again typically to three significant digits for dragon-hunting. In those cases
where the range is arbitrary, or infinite, an integer power-of-ten exponent (5 bits) is also stored.
The default, then, is:

len v = 10 bits or 15 bits.

For the complex proxy described above, which we shall use as a conservative benchmark, the
state-vector would contain one function variable for a cash index, one variable for each of several (say four)
futures, four variables from the input vector $I = [p_u, \sigma, i, \sigma_i]$ for the Type 1 options, perhaps an
additional four (different) input variables for the Type 2 options, and an interest-rate vector of, say, eight
variables, that represents the path-dependent interest-rate curve for the Type 6 options. We can also
imagine that some of the exotics have models so complex that the parametric-hypercube yields a family
of curves; or perhaps an instrument-type is so new that its model has not yet been parameterized. In
either case, variables (say ten) representing curve-selection values, or five slope-intercept pairs for
linearized segments of the curves, need to be stored.

There is also one variable required to store the alternate probability measure.
Then,

n = number of variables = 1 + 4 + 4 + 4 + 8 + 10 (or 5 × 2) + 1 = 32.

If most of the price-related variables (cash, futures, curve-segments) require stored exponents, then

len V = (15 · 15) + (17 · 10) = 395 bits.

Accounting for the length of V' (the time-dual of V), this yields

len s = len(header) + 2 · len V = 36 + 2 · 395 = 826 bits.

The actual word-length, in bytes, for our benchmark proxy state-vector, is then

$A = \lceil \text{len}\ s / 8 \rceil = 104$ bytes.

This will be used to calculate relevant performance parameters in section VII.
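
The word-length arithmetic above can be reproduced with a few lines of Python. The helper names are hypothetical; the field sizes and variable counts are the ones stated for the benchmark proxy.

```python
def bits_for(count: int) -> int:
    """Bits needed to distinguish `count` values (indices 0 .. count-1)."""
    return max(1, (count - 1).bit_length())

HEADER_BITS = {
    "j": bits_for(64),    # back-pointer: fanout of up to 64
    "k": bits_for(16),    # time-step index: up to 16 time steps
    "z": 8,               # flag-vector: one byte, up to 8 conditions
    "P": bits_for(1000),  # probability: 1 part in 1000
    "n": bits_for(256),   # variable count: up to 256 variables
}

MANTISSA_BITS = 10   # three significant digits
EXPONENT_BITS = 5    # only stored when the range is not known in advance

def value_vector_bits(with_exponent: int, without_exponent: int) -> int:
    """Length of V given how many variables carry a stored exponent."""
    return (with_exponent * (MANTISSA_BITS + EXPONENT_BITS)
            + without_exponent * MANTISSA_BITS)

def word_length_bytes(header_bits: int, v_bits: int, copies: int = 2) -> int:
    """Header plus V and its time-dual V', rounded up to whole bytes."""
    total_bits = header_bits + copies * v_bits
    return -(-total_bits // 8)  # ceiling division

header = sum(HEADER_BITS.values())
v_bits = value_vector_bits(with_exponent=15, without_exponent=17)
word_bytes = word_length_bytes(header, v_bits)
```

Changing the two counts passed to `value_vector_bits` shows how sensitive the word length is to the number of variables that need an exponent.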

MetaTables 

The user specifies the nature of the problem domain, and all relevant data, via the RiskScape Interface,
using a Scenario Description Language (SDL). The Instantiation Manager then creates and configures a set
of tables to implement the specifications. Typical tables include:

• Data-word Definition (including field structure & bit-masks for storage and field extraction)

• Hyperspace Definition (topology, K, T)

• Node Index (as an adjunct to pointer induction)

• Proxy Definition (types, model variables, parameters)

• Scenario Definition (input vectors I, the mapping of I to K)

• Input Variable Data (updated & adjusted values for all possible I, such as $\sigma$, i, etc.)

• Probability Transition Data (derived from commercial tables or extracted from time-series)

• Arithmetic Lookup (exponentials, logarithms, roots)

• Conversion Factors (scaling, coefficient, parameter-estimation data)

• Portfolio Data (coded description of all instruments and their specifications, including proxy
links, and whether or not the instrument is to be included in the reference sample)

• Pricing-Model Parametric Data (pre-computed hypercubes of pricing-model values and
moments)

Some of the tables are quite large, such as the Portfolio Data (1 million records or more) and the Node 
Index (a billion records). All are stored on hard disk, but some are also resident in CPU RAM (not 
MassRAM, which holds only the hyperspace H), such as the Data-word Definition, Proxy Definition, 
Probability Transition, Input Variable and Scenario Definition tables. 



VI. The Infrastructural-Technology Suite

The core NVSI modules are:

• RiskScape Application: the user specifies (using SDL) the nature of the problem domain, the type
of structures involved, the fanout K, the time-depth T, the mapping of K to the input-vector I,
the portfolio to evaluate, the scenarios desired, the event and probability thresholds, and so
forth.

• Instantiation Manager: configures all the definition tables and MassRAM to handle the domain
specifications from RiskScape.

• Population Manager: computes all state-vectors in background (off-line) processing, according to
the rules and tables specified by the Instantiation Manager. One of the key optimizing techniques
used is to simulate interleaved memory: the Population Manager first computes all the
values of V, so that they are available to the Navigation Manager (see below) as soon as possible, and
then computes the next updated (dual) set of values V' (the same variables and the same value of I,
but updated for the next day's data, as the portfolio is "aged"). The Navigation Manager then
selects the second set of values (if the ready-flag has been set by the Population Manager), so that
portfolio update is performed synchronously, over the entire hyperspace.

• Navigation Manager: invokes queries from the client-user (in this example via the RiskScape
Application) to search the hyperspace H, evaluate H over each scenario, and mark the nodes that
either meet the event & probability thresholds, or that warrant further exploration (via the
Interpolation Manager). It is the Navigation Manager that responds in near-real-time to user requests,
and "walks the landscape" of the hyperspace to hunt for dragons.

• Interpolation Manager: in concert with the Navigation and Population Managers, it applies mesh
enhancement to selected nodes. That is, it creates an expanded tree-fragment with a finer
granularity in the neighborhood of the node, to yield more accurate values that may lie
"between" certain states. This enhancement may entail expanding the spatial granularity, via a
larger K with finer gradations in $\Delta I$, or using correlations to simulate joint transitions.
Enhancement may also be temporal, expanding T locally by creating a tree-fragment with
finer-grained time-steps (and this may also use correlations to interpolate variations in input variables).
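
The dual-value scheme described for the Population and Navigation Managers can be pictured with a small single-process Python sketch. The class and method names are hypothetical, and a real deployment would need synchronization across the network and across the entire hyperspace, which this toy omits.

```python
from dataclasses import dataclass, field

@dataclass
class StateValues:
    """One state's value-vector V, its time-dual V', and a ready-flag."""
    current: list = field(default_factory=list)   # V
    updated: list = field(default_factory=list)   # V' (next day's values)
    ready: bool = False

    def populate_update(self, new_values) -> None:
        """Population Manager side: fill V', then raise the ready-flag."""
        self.updated = list(new_values)
        self.ready = True

    def select(self) -> list:
        """Navigation Manager side: read V' if it is ready, otherwise V."""
        return self.updated if self.ready else self.current

state = StateValues(current=[1.0, 2.0])
before = state.select()
state.populate_update([1.5, 2.5])
after = state.select()
```

Because the flag is raised only after V' is completely written, a reader never sees a half-updated vector in this sketch.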

VII. Performance Parameters & Evaluation

In this section, boundary conditions and typical mid-range values for spatial (storage) requirements and
temporal performance are derived.

Spatial parameters

Recall that:

A = characteristic data-word length, and

$Q(H)$ = number of available states in H $\equiv Mem / A$, where Mem = available memory (in bytes).

Thus, for the prototype MassRAM of size 128GB (= $137 \times 10^{9}$ bytes), and a typical A (for our
benchmark proxy) of 104 bytes:

$Q = (137 \times 10^{9}) / 104 \approx 10^{9}$ = 1 billion nodes (or states) available in the hyperspace.

Note that with A ranging from 50 bytes (for very simple Type 1-only proxies) to 200 bytes (for very complex
proxies representing instrument Types 2, 5, & 6), the result is still of the same order of magnitude.

For the same size scenario tree as used earlier, with K = 12 (allowing three gradations, or moves, for
each of four input variables) and T = 6 (allowing for six time-steps, such as 1 day, 3 days, 1 week,
$t_{VaR}$ ~ 20 days, 3 months, 6 months), then

$M$ = tree size $\approx 12^{6} \approx 3 \times 10^{6}$ = 3 million nodes in the scenario tree.

Thus,

$N$ = number of scenario-trees $= Q / M = 10^{9} / (3 \times 10^{6}) \approx 300$ = number of proxies allowed.

The number of proxies available to map the portfolio onto is therefore about 300, for K = 12 and T = 6.
Can a large global portfolio be mapped to only 300 proxies, that is, only 300 base indexes? Absolutely.
There are only six types of proxies, and linking these to 300 underlying assets/indexes would cover most
of the developed world.

To bound this result, consider a scenario tree with a larger K of 16:

$M = 16^{6} \approx 1.7 \times 10^{7} \Rightarrow N \approx 60$, still large enough to represent a fairly diverse portfolio.

Suppose that we don't need six time steps, but only four (1 week, $t_{VaR}$, 1 month, 3 months):

$M = 16^{4} = 65{,}536 \Rightarrow N \approx 15{,}000$.

Thus, reducing the number of time-steps significantly increases the number of proxies that can be
created. If we wish to increase the granularity of input variations, perhaps allowing for joint transitions
(like $\Delta\sigma$ & $\Delta p_u$), then K can be increased to 32 or 64. If a client-user simply wanted to value a portfolio at
one time-step (T = 1), $t_{VaR}$, and with very high precision in the scenario mesh (K = 64), then

$N = 10^{9} / 64 \approx 16$ million.

At this level, a sizable portfolio can be evaluated for future risk directly, without the need for proxies.

The realistic spectrum for N, written $N_{K,T}$, is thus:

$(1 \approx N_{32,6}) < (30 \approx N_{32,5}) < (N_{16,6} \approx 60) < (N_{12,6} \approx 300) < (N_{16,4} \approx 15{,}000) < (N_{64,1} \approx 16$ million$)$.
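
The spread of N across tree shapes can be tabulated with a short Python sketch. It assumes the rounded budget of 10^9 states and the large-fanout approximation for the tree size; the (fanout, steps) pairs are chosen to span the range discussed above, and the function name is hypothetical.

```python
Q = 10 ** 9  # approximate number of states available in the hyperspace

def proxies_available(fanout: int, steps: int, total_nodes: int = Q) -> float:
    """N = Q / M with M approximated by fanout ** steps."""
    return total_nodes / fanout ** steps

SHAPES = [(32, 6), (32, 5), (16, 6), (12, 6), (16, 4), (64, 1)]
SPECTRUM = {shape: proxies_available(*shape) for shape in SHAPES}

for (fanout, steps), n in SPECTRUM.items():
    print(f"K={fanout:2d}  T={steps}  N ~ {n:,.0f}")
```

Each extra time-step divides N by the fanout, so depth is far more expensive than breadth in terms of the number of proxies that fit in MassRAM.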

Since many portfolios require fewer than 100 base indexes, then with a typical N of ~ 300 available for
our benchmark tree of K = 12 and T = 6, the effective fanout K can be increased (thus increasing the
input-variation granularity) by indexing an array of proxies to one instrument-class, each identical
except that the scenario-trees use different $\Delta I$ increments. For example, one tree could assign each
branch to variations of (0, $\pm 3\sigma$), another tree could have branches for $\pm 8\sigma$, and another for $\pm 12\sigma$.

As we show in the temporal performance calculations, the virtual effective throughput for navigation is
high enough that it will be possible to implement MassRAM not only as a shared network resource, but
eventually as virtual memory using hard sequential storage. That is, because the hyperspace trees are
fixed and space-filling for a given problem domain, the RAM transaction volume is low, with little
random access. Thus, by using Jini™ technology (for example) over shared network resources, NVSI
v2.0 can use FIFO paging from optimized disk storage, performing a look-ahead page-fetch in 16GB (or
larger) segments, while still not slowing the Navigation Manager. Under such an operating system, Q
is virtually unbounded, and we could realistically process $10^{12}$ nodes (which still only requires a 40-bit
address) or more.

Temporal parameters 

Performance of the NVSI is partitioned into the two primary phases: the time $T_{pop}$ to populate (compute
and fill all the state-vectors in) the state-space, and the time $T_{nav}$ to navigate the space (evaluate the
domain collection, such as a financial portfolio, at each scenario, and flag selected nodes for states
that meet the criteria).

The key to the NVSI idea is that the apparent (effective) throughput in response to a user query is
driven by the navigation-time, as the population-time reflects background (off-line) computing.

As a reference, performance parameters are derived for the RiskScape problem domain, using the
benchmark proxy already described.

POPULATE

The population time has four principal components:

1. A one-time setup by the Instantiation Manager;

2. A one-time pre-computation of the parametric-hypercubes for all relevant pricing models;

3. A recurring (for each daily update, as the portfolio is aged) recalculation of all values in V for
every node in H; and

4. Background I/O and network-sharing overhead, which varies according to the amount of
real-world data capture, and the size of H.

It is the third component that is most characteristic of $T_{pop}$. For our benchmark proxy, the calculation
requirements for each of the header fields are:

j: 3 integer operations (INT)
k: 1 INT
z: 8 INT
P(s): 1 lookup (~ 1 INT) & one floating-point multiply (~ 1 FLOP, FLoating-point OPeration)
n: 1 lookup & store (~ 1 INT)

For V, the calculation times are highly varied. The cash variable requires 1 INT, the futures each require
~ 3 FLOP + 1 lookup (~ 1 INT), and the alternate probability ~ 1 INT + 1 FLOP. Each of the input
parameters involves 1 FLOP to calculate, and similarly for the interest-rate values. The curve-selectors
and/or slope-intercept pairs require 2 FLOP each. Each of the 32 variable values takes 1 INT to store in the
state-vector. Thus, the total (recurring) computation to populate one state-vector for our benchmark is:

1(1 INT) + 4(3 FLOP + 1 INT) + 4(1 FLOP) + 4(1 FLOP) + 8(1 FLOP) +
10(2 FLOP) + 1(1 INT + 1 FLOP) + 32(1 INT)

= 38 INT (~ 4 FLOP) + 49 FLOP ~ 53 FLOP.

The total FLOP, $F_{pop}(3)$, to populate all nodes is then:

$F_{pop}(3) = Q \cdot 53$ FLOP $= 53 \times 10^{9}$ FLOP = 53 GFLOP.

We take as a baseline-reference CPU a typical mid-range, stand-alone workstation, with a computational
throughput of $R = 10^{7}$ FLOPS (FLoating-point Operations Per Second) = 10 MFLOPS. Then,

$T_{pop}(3) = (53 \times 10^{9}$ FLOP$) / (10^{7}$ FLOP/sec$)$ = 5300 seconds ~ 1.5 hours.

To calculate the pre-computation time for the various pricing-model parametric-hypercubes, we require
the characteristic computation time for typical models, and a choice of granularity in the parameter
space. For Black-Scholes options, computing one price and the associated moments ($\Delta$, $\Gamma$, vega, $\Theta$, $\rho$, x)
takes ~ 70 FLOP; one barrier or lookback option & moments, ~ 1000 FLOP; and one exotic derivative
(such as Black-Derman-Toy or Heath-Jarrow-Morton), ~ 2,000 - 20,000 FLOP [Ferrentino 99].

One of each type is needed for the benchmark proxy. Assuming three parameters for each (actually, six
is more appropriate for exotic derivatives, but the hypercube becomes enormous, so each point is made
a vector that embeds a family of curves), and assuming a granularity of 100 increments per parameter
(dimension), then

the number of points for each parametric-hypercube = $100^{3} = 10^{6}$ (1 million).

The total computation to create a Type 1 hypercube is thus $70 \times 10^{6}$ FLOP, for a Type 2 hypercube
$1000 \times 10^{6}$ FLOP, and for a Type 6 hypercube $20000 \times 10^{6}$ FLOP.

Thus, the amount $F_{pop}(2)$ of parametric-hypercube precalculation is dominated by the exotics:

$F_{pop}(2) \approx$ FLOP(Type 6) + FLOP(Type 2) $= 2.1 \times 10^{10}$ FLOP = 21 GFLOP.

For the 10 MFLOPS CPU, this yields

$T_{pop}(2)$ = 2100 seconds ~ ½ hour.

If we estimate the Instantiation time $T_{pop}(1)$ at about the same (to fill all the tables, some with millions of
entries), ~ ½ hour, and assume that system overhead (category 4) doubles all the other times, then we have:

$T_{pop}$(once) $= 2 \cdot [T_{pop}(1) + T_{pop}(2) + T_{pop}(3)]$ ~ 5 hours, and

$T_{pop}$(recurring) $= 2 \cdot T_{pop}(3)$ ~ 3 hours.

That is, the entire hyperspace H of states can be repopulated daily, allowing for real-time aging of the 
portfolio. 
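
The population-time estimate can be reproduced with a short Python sketch. The variable names are hypothetical, and the ratio of ten integer operations per FLOP is an assumption inferred from the "38 INT (~ 4 FLOP)" conversion in the text.

```python
R = 1e7            # reference CPU throughput, FLOP per second (10 MFLOPS)
Q = 1e9            # number of states in the hyperspace
INT_PER_FLOP = 10  # assumption inferred from "38 INT ~ 4 FLOP"

# Component 3: recurring recalculation of V for every node.
flop_per_state = round(49 + 38 / INT_PER_FLOP)
t_pop3_hours = Q * flop_per_state / R / 3600

# Component 2: one hypercube each for the Type 2 and Type 6 models,
# at 100 increments for each of three parameters.
points = 100 ** 3
t_pop2_hours = points * (1000 + 20000) / R / 3600

# Component 1: the paper's estimate for instantiation.
t_pop1_hours = 0.5

# Component 4: overhead modeled as doubling the other times.
OVERHEAD = 2.0
t_once_hours = OVERHEAD * (t_pop1_hours + t_pop2_hours + t_pop3_hours)
t_recurring_hours = OVERHEAD * t_pop3_hours
```

Without the intermediate rounding used in the text, the one-time and recurring totals come out slightly above 5 hours and slightly below 3 hours respectively.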

NAVIGATE 

Navigation of H is the process of evaluating (pricing fully) the entire portfolio over the entire
scenario-tree, and flagging the nodes (states) that satisfy the client-user query (on both probabilities and values).
The Navigation time, $T_{nav}$, is thus dominated by the time required to price the portfolio, which entails
pricing each instrument, and summing over the entire collection. Pricing an instrument involves
calculating the virtual price (and all moments) from the proxy, and transforming with known multiplier
coefficients (scaling, $\alpha$, and correlation, $\beta$). Doing this in turn requires taking each value stored in V
(usually a set of input variables I), combining it with the free parameters of the instrument (such as K &
T for a Black-Scholes option), and calculating the relevant selection parameters, which are used either to access a
parametric-hypercube or, for newer instruments or open-form models, to calculate the final function
output directly.

The process of evaluating the portfolio price for one scenario, that is, for one node (state) in the
scenario-tree, we term mark-to-scenario (m-t-s). Evaluation of the portfolio over all scenarios, we term
mark-to-landscape (m-t-l).



CONFIDENTIAL & PROPRIETARY. Copyright © 1999-2000 Veriscape, Inc. All Rights Reserved. 




Flagging a node involves only two compares and a bit-flip in z, so $T_{nav} \approx T_{\text{m-t-l}}$, which is given by:

$T_{\text{m-t-l}} = (\omega \cdot T_{price} + T_{sum}) \cdot M$

where: $T_{price}$ is the characteristic time to price one actual instrument (and all of its moments),
$\omega$ is the total number of instruments in the portfolio,
$T_{sum}$ is the time to add all of the instrument prices to yield a total value for one state, and
M is the number of scenarios (nodes) in the scenario-tree.

Note that only one instrument-type will be calculated for any given proxy-vector (although the proxy
may contain data to price all the various types it represents).

One of the essential aspects of the NVSI architecture is that, because the function-spaces are
pre-computed (during the populate phase), the time to extract the price-values is nearly independent of
model complexity. Instead, evaluation time depends on the number of parameters that must be
calculated to extract the proxy price.

Of course, the cash and futures values are stored in s directly, so pricing their corresponding
instruments is trivial. Thus, a more realistic estimate is obtained by calculating $T_{price}$ for the options,
all of which are comparable to each other.

For the Type 1 option, the three parameters for extracting the price (and any other moment) are:
$\{K/S,\ (r - \delta)T,\ \sigma\sqrt{T}\}$. Each parameter requires about 1-2 FLOP to calculate, for a triplet total of 4 FLOP. All
of the moments are obtained with the same parameters at the same time from the parametric hypercube,
so the computation to obtain the entire proxy-price is just 4 FLOP. To value the instrument-price, each
proxy-price moment is then multiplied by a combined factor $\gamma = \alpha \cdot \beta$, which adds 1 FLOP per moment to the process.

Thus, to price all eight moments (p, and 7 greeks, for advanced models) of an actual option instrument,
the amount of computation required (in FLOP), $F_{price}$, is given by:

$F_{price}$ = (4 + 8) FLOP = 12 FLOP.

For a 1M ($\omega = 10^{6}$) instrument portfolio, the FLOP required to mark the entire portfolio to all scenarios for our
benchmark is then:

$F_{\text{m-t-l}} = [12\ \text{FLOP} \cdot 10^{6} + 10^{6}] \cdot (3 \times 10^{6}) = 39 \times 10^{12}$ FLOP = 39 TFLOP.

The computation required to fully risk-evaluate a 1 million instrument portfolio over an entire scenario
tree (K = 12, T = 6) is therefore 39 TFLOP.

On the reference CPU, the time required is then:

$T_{\text{m-t-l}} = F_{\text{m-t-l}} / R = (39 \times 10^{12}$ FLOP$) / (10^{7}$ FLOP/sec$) = 39 \times 10^{5}$ seconds ~ 1100 hours.

At this point, two key components of the NVSI come into play:

• A standard procedure for large-scale computing is two-pass simulation. First, a
statistically-representative sample of the full problem domain is evaluated, marking (flagging) the nodes of
interest, and then the full domain is evaluated over the flagged nodes. This is equivalent to a
mesh-enhancement for the navigation phase. In portfolio-risk analysis, a 1000-fold reduction of the
portfolio to a sample is well within norms. Under this compression, then,

$T_{\text{m-t-l}}$ (sample) $= (39 \times 10^{5}$ sec$) / 10^{3}$ ~ 1.1 hours.

Assume that one in a thousand nodes (states) is flagged for full evaluation of the
entire portfolio (because the sample portfolio value meets the client-user query criteria). Then,

$T_{\text{m-t-l}}$ (entire) ~ 1.1 hours, again.

Thus, the total navigation time is:

$T_{nav} = T_{\text{m-t-l}}$ (sample) $+ T_{\text{m-t-l}}$ (entire) = 2.2 hours, for 1 CPU.

• The netcentric component: With a typical 100BaseT LAN of 100 workstations, with a 50%
availability (sharing-efficiency), we have:

$T_{nav}$ (Netcentric) = (2.2 hours) / (100 · 0.5) = 158 seconds ~ 2.6 minutes.

Using netcentric computing and problem-domain optimization, the entire portfolio can be
navigated in about 2.6 minutes, that is, in near-realtime.

Recall that the client-user measures performance in terms of the response to query, that is, on-demand
access to H over all scenarios. Thus, we calculate an effective (virtual) throughput for this computation,
optimized for portfolio risk-analysis with 100 shared reference-CPUs (in terms of virtual-FLOPS, or
vFLOPS) as:

$R_{NVSI}$ (Risk Domain × 100) $= F_{\text{m-t-l}} / T_{nav}$ = 39 TFLOP / 158 sec ~ 250 GvFLOPS.

A bounded spectrum for performance (in units of GvFLOPS) is then:

5 ~ $R_{NVSI}$ (RD × 1) < 25 ~ $R_{NVSI}$ (RD × 10) < 250 ~ $R_{NVSI}$ (RD × 100).

Note that the value for 1 CPU assumes a dedicated machine.
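
The navigation estimate, including the two-pass reduction and the netcentric speed-up, can be traced with the following Python sketch. The names are hypothetical; the constants are the benchmark figures used in this section, with one addition per instrument assumed for the summation step.

```python
R = 1e7              # reference CPU throughput, FLOP per second
OMEGA = 10 ** 6      # instruments in the portfolio
M = 3 * 10 ** 6      # scenarios (nodes) in the benchmark tree
F_PRICE = 12         # FLOP to price one instrument and its moments
F_SUM = 1            # FLOP per instrument to accumulate the total (assumed)

def mark_to_landscape_flop(instruments: int, scenarios: int) -> int:
    """FLOP to price and sum `instruments` at each of `scenarios` nodes."""
    return (F_PRICE * instruments + F_SUM * instruments) * scenarios

f_full = mark_to_landscape_flop(OMEGA, M)

# Pass 1: a 1/1000 sample of the portfolio over every node.
t_sample = mark_to_landscape_flop(OMEGA // 1000, M) / R
# Pass 2: the full portfolio over the 1/1000 of nodes that were flagged.
t_entire = mark_to_landscape_flop(OMEGA, M // 1000) / R

t_nav_single = t_sample + t_entire            # seconds on one dedicated CPU
t_nav_network = t_nav_single / (100 * 0.5)    # 100 workstations, 50% available
effective_vflops = f_full / t_nav_network     # virtual FLOPS seen by the user
```

Carried through without rounding each pass to 1.1 hours, the network time is about 156 seconds rather than 158, and the effective throughput is about 250 GvFLOPS in either case.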

Comparison with Commercial Solutions 

For comparison, if one were to simply "scale-up" existing instrument-valuation software, and run it on
the best mainframe (dedicated hardware supercomputer) available, then a similar calculation to the
above would involve very different computation times for different instruments. If we assume a typical
global portfolio with 1M instruments, and a mix of 50% Type 1, 40% Type 2 and 10% Type 6, then the
total computational demand is given by:

[(0.5 · 1000 FLOP + 0.4 · 70 FLOP + 0.1 · 20,000 FLOP) · $10^{6}$ + $10^{6}$] · $(3 \times 10^{6})$

$= 7587 \times 10^{12}$ FLOP = 7587 TFLOP.

A Cray T3E Supercomputer has a peak (burst) throughput of ~ 2.2 TFLOPS, and a sustainable rate
(which we estimate is appropriate for the I/O requirements of this problem space) of R ~ 250 GFLOPS.
Therefore, even on a T3E the Risk Domain problem would require:

$T_{nav}$ (Cray T3E) $= (7587 \times 10^{12}$ FLOP$) / (250 \times 10^{9}$ FLOP/sec$)$ ~ 30,000 sec = 8.4 hours.

On a more affordable (and obtainable for a financial institution) Cray T90 (for which we estimate in this
problem space a sustainable R ~ 18 GFLOPS), the problem would take nearly 5 days, and on a
conventional high-end business mainframe, about 100 days.

PRICE / PERFORMANCE 

The most straightforward measure of price/performance is simply cost/R. The projected cost of
version 1.0 of NVSI is ~ $10M. The cost of a fully-populated Cray T90 is $20M, and of a fully-populated
Cray T3E ~ $100M. Thus:

P/P NVSI = ($10 × 10^6) / (250 × 10^9 vFLOPS) = 0.00004 $/FLOPS = 0.004 ¢/FLOPS

P/P T90 = ($20 × 10^6) / (18 × 10^9 FLOPS) = 0.00111 $/FLOPS

P/P T3E = ($100 × 10^6) / (250 × 10^9 FLOPS) = 0.00040 $/FLOPS

Therefore, the NVSI is about 10 times more cost-effective than the only other machine that can solve
the problem in reasonable time. And this does not even include the extended costs for the Cray machine
of software development (including staff), machine-room support, & depreciation.
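
The three ratios can be recomputed with a few lines of Python. The dictionary layout and names are hypothetical; the costs and sustained rates are the estimates quoted above.

```python
# (projected cost in dollars, sustained throughput in FLOPS or vFLOPS)
SYSTEMS = {
    "NVSI v1.0": (10e6, 250e9),
    "Cray T90": (20e6, 18e9),
    "Cray T3E": (100e6, 250e9),
}

def dollars_per_flops(cost: float, rate: float) -> float:
    """Price/performance expressed as cost divided by throughput."""
    return cost / rate

RATIOS = {name: dollars_per_flops(cost, rate)
          for name, (cost, rate) in SYSTEMS.items()}

# How many times more cost-effective the NVSI is than each alternative.
ADVANTAGE = {name: RATIOS[name] / RATIOS["NVSI v1.0"] for name in SYSTEMS}
```

On these figures the advantage over the T3E is a factor of 10, and over the T90 roughly a factor of 28.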

In contrast, all of our NVSI calculations have been conservative: in MassRAM use, problem-complexity,
structural-requirements, estimated calculation times, and network sharing & resources.

With these factors considered, we project that the cost of an NVSI implementation will be 50-100 times
less than any comparable hardware-based commercial solution.

VIII. Summary

What have we really achieved? In theoretical terms the three fundamental breakthroughs are:

1. The NVSI architecture allows flexible (and extensible) reconfiguration of the system memory in
such a way that the actual topology of the computational surface is modifiable. This means that a
true virtual machine is constructed, one which is unique for each problem domain. One of the
principal assertions of computability is that the more closely the architecture of the machine matches
the architecture of the problem being solved, the more efficiently the machine will solve that
problem. In our case we accomplish this because the core of the NVSI is a
problem-domain-optimized solution manifold. This solution manifold has been created, or "instantiated", in system
memory (hence the name for one of the component modules: the Instantiation Manager). In the
case of the RiskScape application, this solution manifold is a combination of a rooted, ordered,
unidirected graph and a set of pricing-model hypercubes. This manifold is unique to the problem of
financial portfolio stress-testing, and is one of the principal reasons that multi-gigaflop throughput is
attained.

2. The power of netcentric distributed computation is brought to bear in two ways. First, we break the
metaproblem into two independent components: that which can be pre-computed and stored in
memory, and the real-time query process. The Population Manager takes responsibility for the
former, and the Navigation Manager for the latter. This enables a temporal compression to be
applied to the problem. In the case of RiskScape the solution manifold requires on the order of
75 GigaFLOP ($75 \times 10^{9}$ floating-point operations) to fully populate all of the pricing vectors in state
space. However, to query (i.e. fully navigate) this manifold such that a one million instrument
portfolio is evaluated against three million scenarios takes less than an hour of user time. This
theoretical performance assumes that only 10 "Pentium-Class" CPUs are dedicated to the
Navigation process. With 100 Pentium IIIs the process runs in near real-time.

3. This is all accomplished with off-the-shelf hardware. In the first version of the NVSI platform, the
system will require one dedicated machine as the memory manager. However, we anticipate that
with coming advances in Network Operating Systems², this constraint will be removed. At that point,
perhaps 18 to 24 months from the date of this writing, NVSI will be truly and completely a virtual
supercomputer. Then, as the software continues to evolve, so does the "machine".



² We especially anticipate advances in Sun Microsystems' Jini™, which enables peer-to-peer communication between
component hardware. Sun Microsystems engineers suggest that this peer-to-peer addressing will evolve to the point that the
RAM on any given physical machine can be directly addressed by other physical machines. This will mean that our NVSI
platform will be fully "virtualized", requiring no dedicated hardware at all.